International Journal of Power Electronics and Drive System (IJPEDS) 
Vol. 12, No. 1, March 2021, pp. 551~557 
ISSN: 2088-8694, DOI: 10.11591/ijpeds.v12.i1.pp551-557


Adaptive dynamic programming algorithm for uncertain 
nonlinear switched systems 


Dao Phuong Nam1, Nguyen Hong Quang2, Nguyen Nhat Tung3, Tran Thi Hai Yen4
1School of Electrical Engineering, Hanoi University of Science and Technology,
Bach Khoa, Hai Ba Trung, Ha Noi, Vietnam
2,4Thai Nguyen University of Technology, So 666 D. 3/2, P, Thanh pho Thái Nguyên, Thái Nguyên, Vietnam
3Electric Power University, 235 Hoang Quoc Viet, Co Nhue, Tu Liem, Ha Noi 129823, Vietnam











Article Info

Article history:
Received Feb 2, 2020
Revised Dec 15, 2020
Accepted Jan 10, 2021

Keywords:
Adaptive dynamic programming
HJB equation
Lyapunov stability
Neural networks
Nonlinear switched systems

ABSTRACT

This paper studies an approximate dynamic programming (ADP) strategy for a class of nonlinear switched systems subject to external disturbances. Neural networks (NNs) are used to approximate the unknown parts of the actor and the critic for the corresponding nominal system. Both networks are trained simultaneously by minimizing the squared error of the Hamiltonian function. The tracking error of the closed-loop system is shown to converge to a neighborhood of the origin in the uniformly ultimately bounded (UUB) sense. Simulation results illustrate the effectiveness of the ADP-based controller.

This is an open access article under the CC BY-SA license.

Corresponding Author: 
Nguyen Hong Quang 


Thai Nguyen University of Technology 
So 666 D. 3/2, P, Thanh pho Thái Nguyên, Thái Nguyên, Vietnam
Email: quang.nguyenhong@tnut.edu.vn








1. INTRODUCTION 

Many industrial systems can be described as switched systems, for example DC-DC converters [1]-[3], H-bridge inverters [4], multilevel inverters [5], and photovoltaic inverters [6]. Although many approaches for switched systems have been proposed, e.g., switching-delay tolerant control [7] and classical nonlinear control [8]-[12], optimization-based approaches, which have the advantage of handling input/state constraints, have received little attention. Fuzzy, artificial neural network (ANN), and particle swarm optimization (PSO) techniques have been investigated for several other systems, such as photovoltaic inverters and transmission lines [13]-[17].

Adaptive dynamic programming (ADP) has been considered in many situations, such as nonlinear continuous-time systems [18], actuator saturation [19], linear systems [20]-[22], and output constraints [23]. For nonlinear systems, the algorithm is implemented based on neural networks (NNs), whereas the Kronecker product is employed for linear systems. Furthermore, data-driven techniques should be mentioned as a way to compute the actor/critic precisely. It should also be noted that robotic systems have been controlled by ADP algorithms [24]-[25].

This work proposes an adaptive dynamic programming solution for perturbed nonlinear switched systems based on neural networks. Considering the Hamiltonian function allows us to derive the learning law of these neural networks. The UUB stability of the closed-loop system is analyzed, and simulation results illustrate the effectiveness of the proposed controller.
2. PROBLEM STATEMENTS
Consider the following uncertain nonlinear continuous-time switched system:

$$\frac{d}{dt}\xi(t) = f_i(\xi) + g_i(\xi(t))\,\left(u + \Delta(\xi,t)\right) \tag{1}$$

where $\xi(t) \in \Omega_x \subset \mathbb{R}^n$ denotes the state variables and $u(t) \in \Omega_u \subset \mathbb{R}^m$ the control variables. The function $\delta : [0,+\infty) \to \Omega = \{1,2,\dots,l\}$ is the switching signal, a piecewise continuous function of time, and $l$ is the number of subsystems. $f_i(\xi)$ are uncertain smooth vector functions with $f_i(0) = 0$, and $g_i(\xi)$ are smooth vector functions satisfying $G_{\min} \le \|g_i(\xi)\| \le G_{\max}$. The switching index $\delta(t)$ is unknown.

Assumption 1: $\Delta(\xi,t)$ is bounded by a known function $\rho(\xi)$, i.e., $\|\Delta(\xi,t)\| \le \rho(\xi)$.

Consider the cost function associated with the uncertain switched system (1):

$$J(\xi,u) = \int_t^{\infty} r\left(\xi(\tau),u(\tau)\right)\,d\tau \tag{2}$$

where $r(\xi,u) = \xi^T Q \xi + u^T R u$, with $Q = Q^T > 0$ and $R = R^T > 0$.
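The running cost in (2) can be evaluated numerically. The following is a minimal sketch, assuming the integrand $r(\xi,u) = \xi^T Q \xi + u^T R u$ and a left Riemann sum over sampled data; the weights `Q`, `R`, the sample values, and the helper names are illustrative assumptions, not quantities prescribed by the method.

```python
def quad_form(v, M):
    """Return v^T M v for a vector v and a square matrix M (nested lists)."""
    return sum(v[i] * M[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))


def running_cost(xi, u, Q, R):
    """Integrand of (2): r(xi, u) = xi^T Q xi + u^T R u."""
    return quad_form(xi, Q) + quad_form(u, R)


def approximate_cost(states, inputs, Q, R, dt):
    """Left Riemann sum of r over sampled (state, input) pairs with step dt."""
    return dt * sum(running_cost(x, u, Q, R) for x, u in zip(states, inputs))


# Illustrative weights and samples (assumed for this example only).
Q = [[1.0, 0.0], [0.0, 3.0]]
R = [[2.0]]

r_value = running_cost([1.0, 2.0], [0.5], Q, R)
J_value = approximate_cost([[1.0, 0.0], [0.5, 0.0]], [[0.0], [0.0]], Q, R, 0.1)
print(r_value, J_value)
```

The sum only approximates the infinite-horizon integral on a finite window, so it is useful for comparing controllers on the same horizon rather than for computing $J$ exactly.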

The main purpose is to design a state feedback controller, together with an upper bound on the cost, such that the closed-loop system under this controller is robustly stable. Additionally, the performance index (2) is bounded as $J \le K(\xi,u) \le M$.

Definition: The term $K(\xi,u)$ is called an appropriate performance index. Accordingly, the control input $u^* = \arg\min K(\xi,u)$ is referred to as the optimal control of the appropriate performance index method.


3. CONTROL DESIGN 
The nominal system obtained by eliminating the disturbance in the switched system (1) is described by:

$$\frac{d}{dt}\xi = f_i(\xi) + g_i(\xi)\,u \tag{3}$$

The performance index of system (3) is modified as (4):

$$Q_1(\xi,u) = \int_t^{\infty} \left[ r(\xi,u) + \gamma\rho^2(\xi) \right] d\tau \tag{4}$$


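We prove that $Q_1(\xi,u)$ with $\gamma > \|R\|$ is one of the appropriate performance indexes of the dynamical system (1). Define $V^*(t) = \min_{u \in \Omega_u} Q_1(\xi,u)$, so that (5) holds:

$$V^* = \min_{u \in \Omega_u} \int_t^{\infty} \left\{ r(\xi,u) + \gamma\rho^2(\xi) \right\} d\tau \tag{5}$$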
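Based on the nominal system (3) and the cost function (4), the Hamiltonian function is obtained as (6):

$$H(\xi,u,V^*) = r(\xi,u) + \gamma\rho^2(\xi) + \left(\frac{\partial V^*}{\partial \xi}\right)^T \left( f_i(\xi) + g_i(\xi)\,u \right) \tag{6}$$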


By using the optimality principle, the optimal control input is obtained as (7):

$$u^*(\xi) = -\frac{1}{2} R^{-1} g_i(\xi)^T \frac{\partial V^*}{\partial \xi} \tag{7}$$
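The control law $u^* = -\frac{1}{2} R^{-1} g_i^T \nabla V^*$ can be illustrated as follows. This is a minimal sketch, assuming a single control input (so that $R$ reduces to a positive scalar `r`) and a hypothetical quadratic value function used only to produce a gradient; neither assumption comes from the paper.

```python
def grad_quadratic_value(xi, weights):
    """Gradient of the hypothetical V(xi) = sum_k weights[k] * xi[k]**2."""
    return [2.0 * w * x for w, x in zip(weights, xi)]


def optimal_control(g, grad_v, r):
    """Single-input form of (7): u = -(1 / (2 r)) * g^T grad_v, scalar r > 0."""
    if r <= 0.0:
        raise ValueError("r must be positive")
    return -0.5 / r * sum(gk * dk for gk, dk in zip(g, grad_v))


xi = [1.0, 0.5]                                   # illustrative state
grad_v = grad_quadratic_value(xi, [1.0, 1.0])     # gradient [2.0, 1.0]
u = optimal_control([1.0, -1.0], grad_v, 2.0)     # input vector g = [1, -1]
print(u)
```

For a multi-input system the scalar division would be replaced by solving a linear system with the matrix $R$.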


We continue to use the control law (7) for the nonlinear continuous-time switched system (1) and obtain the following result.
Theorem 1: The system (1) under the controller $u^*(\xi) = -\frac{1}{2} R^{-1} g_i(\xi)^T \frac{\partial V^*}{\partial \xi}$ is stable with the associated Lyapunov function candidate:

$$V = \int_t^{\infty} \left[ r(\xi,u) + \gamma\rho^2(\xi) \right] d\tau \tag{8}$$

where $\gamma > \|R\|$.
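Proof: Taking the derivative of $V$ along (1) under the control input $u(\xi) = -\frac{1}{2} R^{-1} g_i(\xi)^T \nabla V^*$, we obtain (9):

$$\frac{d}{dt}V = -\xi^T Q \xi - \left( \gamma\rho^2(\xi) - \Delta(\xi,t)^T R\, \Delta(\xi,t) \right) - \left(u + \Delta(\xi,t)\right)^T R \left(u + \Delta(\xi,t)\right) \tag{9}$$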


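Since $\gamma > \|R\|$ and $\|\Delta(\xi,t)\| \le \rho(\xi)$ imply $\gamma\rho^2(\xi) \ge \Delta^T R \Delta$, and the last term of (9) is nonpositive, we conclude (10):

$$\dot V(t) \le -\xi^T Q \xi \tag{10}$$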
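The step from (9) to (10) can be checked numerically. The sketch below assumes the right-hand side of (9) reads $-\xi^T Q \xi - (\gamma\rho^2 - \Delta^T R \Delta) - (u+\Delta)^T R (u+\Delta)$, takes $\|R\|$ as the spectral norm, and samples random disturbances with $\|\Delta\| \le \rho$; the values of `Q`, `R`, `gamma` and the bound `rho` are illustrative assumptions.

```python
import math
import random


def quad_form(v, M):
    """Return v^T M v for a vector v and a square matrix M (nested lists)."""
    return sum(v[i] * M[i][j] * v[j]
               for i in range(len(v)) for j in range(len(v)))


def lyapunov_rate(xi, u, delta, Q, R, gamma, rho):
    """Right-hand side of (9) for given state, input, disturbance and bound."""
    margin = gamma * rho ** 2 - quad_form(delta, R)
    shifted = [ui + di for ui, di in zip(u, delta)]
    return -quad_form(xi, Q) - margin - quad_form(shifted, R)


Q = [[1.0, 0.0], [0.0, 3.0]]
R = [[2.0, 0.0], [0.0, 2.0]]    # spectral norm 2
gamma = 2.5                      # assumed, chosen larger than ||R||
rng = random.Random(0)
violations = 0
for _ in range(1000):
    xi = [rng.uniform(-5, 5), rng.uniform(-5, 5)]
    u = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    rho = 0.1 * math.hypot(*xi)                        # assumed bound rho(xi)
    direction = [rng.gauss(0, 1), rng.gauss(0, 1)]
    scale = rho * rng.random() / math.hypot(*direction)
    delta = [scale * d for d in direction]             # ||delta|| <= rho
    rate = lyapunov_rate(xi, u, delta, Q, R, gamma, rho)
    if rate > -quad_form(xi, Q) + 1e-12:
        violations += 1
print(violations)
```

No sample should violate the inequality, because $\Delta^T R \Delta \le \|R\|\,\|\Delta\|^2 \le \gamma\rho^2$ whenever $\gamma \ge \|R\|$.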


Therefore, the system (1) is robustly stable. However, the HJB equation cannot be solved directly. Hence, the optimal performance index $V^*$ for system (3) is represented by a NN as (11):

$$V^* = w^T \sigma(\xi) + \varepsilon(\xi) \tag{11}$$

where $\sigma(\xi) : \mathbb{R}^n \to \mathbb{R}^N$ with $\sigma(0) = 0$, and $w \in \mathbb{R}^N$ is the constant NN weight vector. $\sigma(\xi)$ can be chosen to guarantee that $\varepsilon(\xi) \to 0$ and $\nabla\varepsilon(\xi) \to 0$ as $N \to \infty$, so for fixed $N$ we can assume the following.
Assumption 2: $\|\varepsilon(\xi)\| \le \varepsilon_{\max}$; $\|\nabla\varepsilon(\xi)\| \le \nabla\varepsilon_{\max}$; $\nabla\sigma_{\min} \le \|\nabla\sigma(\xi)\| \le \nabla\sigma_{\max}$; $\|w\| \le w_{\max}$.
Substituting (7) into (6), we obtain (12):

$$H(\xi,u^*,V^*) = \xi^T Q \xi + \gamma\rho^2(\xi) + (\nabla V^*)^T f_i(\xi) - \frac{1}{4} (\nabla V^*)^T g_i(\xi) R^{-1} g_i(\xi)^T (\nabla V^*) = 0 \tag{12}$$


Formula (11) leads to (13):

$$\nabla V^* = \left(\nabla\sigma(\xi)\right)^T w + \nabla\varepsilon(\xi) \tag{13}$$
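We obtain the description (14):

$$e_{NN} = -\nabla\varepsilon(\xi)^T \left( f_i(\xi) + g_i(\xi)\,u \right) + \frac{1}{4} \nabla\varepsilon(\xi)^T g_i(\xi) R^{-1} g_i(\xi)^T \nabla\varepsilon(\xi) \tag{14}$$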


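It follows that $e_{NN}$ converges uniformly to zero as $N \to \infty$. For each $N$, $e_{NN}$ is bounded on a region as $\|e_{NN}\| \le e_{\max}$. Under the structure of the ADP-based controller, the critic NN is computed as (15):

$$\hat V = \hat w^T \sigma(\xi) = \sigma(\xi)^T \hat w; \qquad \hat u = -\frac{1}{2} R^{-1} g_i(\xi)^T \nabla \hat V \tag{15}$$

It follows that:

$$e_{HJB} = \xi^T Q \xi + \gamma\rho^2(\xi) + \hat w^T \nabla\sigma(\xi) f_i(\xi) - \frac{1}{4} \hat w^T \nabla\sigma(\xi)\, g_i(\xi) R^{-1} g_i(\xi)^T \nabla\sigma(\xi)^T \hat w \tag{16}$$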
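The residual in (16) can be evaluated for a concrete critic. The following is a minimal sketch, assuming $e_{HJB} = \xi^T Q \xi + \gamma\rho^2 + \hat w^T \nabla\sigma f_i - \frac{1}{4}\hat w^T \nabla\sigma g_i R^{-1} g_i^T \nabla\sigma^T \hat w$, a single input with scalar weight `r`, and a hypothetical quadratic basis $\sigma(\xi) = [x_1^2,\ x_1 x_2,\ x_2^2]^T$; the basis and all numerical values are illustrative assumptions.

```python
def jacobian_sigma(xi):
    """Jacobian of the assumed basis sigma(xi) = [x1^2, x1*x2, x2^2] (3 x 2)."""
    x1, x2 = xi
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]


def hjb_residual(xi, w_hat, f, g, Q, r, gamma, rho):
    """Single-input form of (16) with scalar input weight r > 0."""
    J = jacobian_sigma(xi)
    # grad_v = (grad sigma)^T w_hat, the gradient of the critic output
    grad_v = [sum(w_hat[k] * J[k][j] for k in range(3)) for j in range(2)]
    state_cost = sum(xi[i] * Q[i][j] * xi[j]
                     for i in range(2) for j in range(2))
    drift = sum(dv * fj for dv, fj in zip(grad_v, f))
    input_gain = sum(dv * gj for dv, gj in zip(grad_v, g))
    return state_cost + gamma * rho ** 2 + drift - input_gain ** 2 / (4.0 * r)


Q = [[1.0, 0.0], [0.0, 3.0]]
e = hjb_residual(xi=[1.0, 0.0], w_hat=[1.0, 0.0, 1.0],
                 f=[-1.0, 1.0], g=[1.0, -1.0],
                 Q=Q, r=2.0, gamma=2.5, rho=0.1)
print(e)
```

A residual far from zero indicates that the current weights do not yet satisfy the HJB equation at that state, which is what the training law is meant to correct.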


The training law is based on the steepest descent method:

$$\dot{\hat w} = -\alpha \frac{\partial E}{\partial \hat w} \tag{17}$$

with $E = \frac{1}{2} e_{HJB}^T e_{HJB}$.

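Remark 1: The weight $\hat w$ is trained to minimize the network error $G = \frac{1}{2} e_{HJB}^T e_{HJB}$. This result is obtained from (18):

$$\frac{\partial G}{\partial t} = -\alpha \left\| \frac{\partial G}{\partial \hat w} \right\|^2 \tag{18}$$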
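A steepest-descent critic update of this kind can be sketched as follows. The example assumes a forward-Euler discretization of $\dot{\hat w} = -\alpha\, \partial E / \partial \hat w$ with $E = \frac{1}{2} e_{HJB}^2$, a single input with scalar weight `r`, and the hypothetical basis $\sigma(\xi) = [x_1^2,\ x_1 x_2,\ x_2^2]^T$; the gain `alpha`, the step `dt`, and the sample point are illustrative assumptions.

```python
def jacobian_sigma(xi):
    """Jacobian of the assumed basis sigma(xi) = [x1^2, x1*x2, x2^2] (3 x 2)."""
    x1, x2 = xi
    return [[2.0 * x1, 0.0], [x2, x1], [0.0, 2.0 * x2]]


def residual_and_gradient(xi, w_hat, f, g, Q, r, gamma, rho):
    """Return e_HJB and its gradient with respect to w_hat (single input)."""
    J = jacobian_sigma(xi)
    grad_v = [sum(w_hat[k] * J[k][j] for k in range(3)) for j in range(2)]
    Jf = [sum(J[k][j] * f[j] for j in range(2)) for k in range(3)]
    Jg = [sum(J[k][j] * g[j] for j in range(2)) for k in range(3)]
    gain = sum(dv * gj for dv, gj in zip(grad_v, g))
    cost = sum(xi[i] * Q[i][j] * xi[j] for i in range(2) for j in range(2))
    drift = sum(dv * fj for dv, fj in zip(grad_v, f))
    e = cost + gamma * rho ** 2 + drift - gain ** 2 / (4.0 * r)
    # d e / d w_hat = grad_sigma f - (1 / (2 r)) (w_hat^T grad_sigma g) grad_sigma g
    de_dw = [Jf[k] - gain / (2.0 * r) * Jg[k] for k in range(3)]
    return e, de_dw


def critic_step(w_hat, e, de_dw, alpha, dt):
    """One Euler step of w_hat' = -alpha * dE/dw_hat, with E = e^2 / 2."""
    return [w - alpha * dt * e * d for w, d in zip(w_hat, de_dw)]


sample = dict(xi=[1.0, 0.0], f=[-1.0, 1.0], g=[1.0, -1.0],
              Q=[[1.0, 0.0], [0.0, 3.0]], r=2.0, gamma=2.5, rho=0.1)
w0 = [1.0, 0.0, 1.0]
e0, grad0 = residual_and_gradient(w_hat=w0, **sample)
w1 = critic_step(w0, e0, grad0, alpha=0.1, dt=0.01)
e1, _ = residual_and_gradient(w_hat=w1, **sample)
print(0.5 * e0 ** 2, 0.5 * e1 ** 2)
```

With a sufficiently small step the squared residual at the sampled state decreases, which is the discrete counterpart of the descent property stated in Remark 1.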
Theorem 2: Consider the feedback controller in (15) with the critic weight updated by the training law (17). Then the weight estimation error $\tilde w = w - \hat w$ and the closed-loop state vector $\xi(t)$ are uniformly ultimately bounded (UUB).

Proof: Choose the Lyapunov function:

$$V(t) = V_1(t) + V_2(t), \quad \text{where } V_1(t) = \frac{1}{2\alpha}\,\tilde w^T(t)\,\tilde w(t), \quad V_2(t) = V^* \tag{19}$$


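Using Assumption 3: $\|f_i(\xi) + g_i(\xi)u^*\| \le \mu_{\max}$, and the definitions $\mu_i = f_i(\xi) + g_i(\xi)u^*$; $G_i = g_i(\xi) R^{-1} g_i(\xi)^T$; $\nabla\sigma = \nabla\sigma(\xi)$; $\nabla\varepsilon = \nabla\varepsilon(\xi)$, and taking the derivative of $V_1(t)$, we obtain (20):

$$\dot V_1(t) = -\tilde w^T \left( -e_{NN} + \tilde w^T \nabla\sigma\,\mu_i + \frac{1}{2} \tilde w^T \nabla\sigma\, G_i \nabla\varepsilon + \frac{1}{4} \tilde w^T \nabla\sigma\, G_i \nabla\sigma^T \tilde w \right) \nabla\sigma \left( \mu_i + \frac{1}{2} G_i \left( \nabla\sigma^T \tilde w + \nabla\varepsilon \right) \right) \tag{20}$$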


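It leads to the estimate $\dot V_1(t) \le -\eta_1$. For the term $V_2(t)$, using (12), (13), and (15) we have (21):

$$\dot V_2 = (\nabla V^*)^T \left( f_i + g_i (\hat u + \Delta) \right) = -\left( \xi^T Q \xi + \gamma\rho^2 \right) - \frac{1}{4} (\nabla V^*)^T g_i R^{-1} g_i^T (\nabla V^*) + \frac{1}{2} (\nabla V^*)^T g_i R^{-1} g_i^T \left( \nabla\sigma(\xi)^T \tilde w + \nabla\varepsilon(\xi) \right) + (\nabla V^*)^T g_i\, \Delta(\xi,t) \tag{21}$$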
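Assume that $\rho(\xi) = a\|\xi\|$. From (21) we have (22):

$$\dot V_2 \le -\left( \lambda_{\min}(Q) + \gamma a^2 \right) \|\xi\|^2 + \theta^2 \tag{22}$$

with $\theta^2 = -\frac{1}{4} (\nabla V^*)^T g_i R^{-1} g_i^T (\nabla V^*) + \frac{1}{2} (\nabla V^*)^T g_i R^{-1} g_i^T \left( \nabla\sigma(\xi)^T \tilde w + \nabla\varepsilon(\xi) \right) + (\nabla V^*)^T g_i \Delta$.
Based on Assumptions 1 and 2, we have (23):

$$\theta^2 \le \frac{1}{4} \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right)^2 G_{\max}^2 \lambda_{\max}(R^{-1}) + \frac{1}{2} \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) \left( \|\tilde w\| \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) G_{\max}^2 \lambda_{\max}(R^{-1}) + \left( w_{\max} \nabla\sigma_{\max} + \nabla\varepsilon_{\max} \right) G_{\max}\, a \|\xi\| \tag{23}$$

Since the bound (23) grows at most linearly in $\|\xi\|$, for $\|\xi\|$ sufficiently large we have $\left( \lambda_{\min}(Q) + \gamma a^2 \right) \|\xi\|^2 - \theta^2 \ge \eta_2$ with $\eta_2 > 0$, and we obtain (24):

$$\dot V_2(t) \le -\eta_2 \tag{24}$$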


Remark 2: The coefficients $\eta_1$, $\eta_2$ can be adjusted by tuning the NN that approximates the optimal performance index. Moreover, for an arbitrary switching index, after a finite time the variables $\|\xi\|$ and $\|\tilde w\|$ converge to bounded regions. The ADP controller $\hat u$ proposed in (15) converges to a neighborhood of $u^*$.
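Proof: The deviation of the control input is estimated as (25):

$$\|\hat u - u^*\| = \frac{1}{2} \left\| R^{-1} g_i(\xi)^T \left( \nabla\sigma(\xi)^T \tilde w + \nabla\varepsilon(\xi) \right) \right\| \le \frac{1}{2} \lambda_{\max}(R^{-1})\, G_{\max} \left( \nabla\sigma_{\max} \|\tilde w\| + \nabla\varepsilon_{\max} \right) \tag{25}$$

Thus the proof is completed.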
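The size of the neighborhood around $u^*$ can be illustrated numerically. The sketch below assumes the bound takes the form $\frac{1}{2}\lambda_{\max}(R^{-1})\,G_{\max}\,(\nabla\sigma_{\max}\|\tilde w\| + \nabla\varepsilon_{\max})$; the function name and all numerical values are hypothetical.

```python
def control_deviation_bound(lam_max_r_inv, g_max, dsigma_max,
                            w_tilde_norm, deps_max):
    """0.5 * lam_max(R^-1) * G_max * (dsigma_max * ||w_tilde|| + deps_max)."""
    return 0.5 * lam_max_r_inv * g_max * (dsigma_max * w_tilde_norm + deps_max)


# Illustrative constants (assumed for this example only).
bound_early = control_deviation_bound(0.5, 1.5, 2.0, 0.2, 0.05)
bound_late = control_deviation_bound(0.5, 1.5, 2.0, 0.0, 0.05)
print(bound_early, bound_late)
```

As the weight error shrinks, the bound tends to a floor set by the gradient of the approximation error, which is why the controller reaches a neighborhood of $u^*$ rather than $u^*$ itself.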
4. SIMULATION RESULTS


In this section, we present simulations to validate the performance of the proposed control scheme. Let N = 2 and the subsystems of the switched system be (26) and (27).

$$\begin{aligned} \dot x_1 &= -x_1^3 - 2x_2 + \left( u + \Delta_1(x,t) \right) \\ \dot x_2 &= x_1 + 0.5\cos(x_1^2)\sin(x_2^3) - \left( u + \Delta_1(x,t) \right) \end{aligned} \tag{26}$$
t= —x} sin (x2) + (u + Ag (x, t)) (27) 
ty = ir — cos (x1 ) cos (z3) — (u + Ag (x, t)) 
The initial state vectors can be chosen as (28). 
z(o)=[5 T (8) 


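The parameter matrices are chosen as $R = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}$ and $Q = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$, with $a = 0.1$ and $\lambda = 5$.
The simulation results shown in Figure 1 and Figure 2 validate the effectiveness of the proposed algorithm.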
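To reproduce trajectories of this kind, the subsystem dynamics must be integrated in time. The following is a minimal sketch of a forward-Euler step, assuming the first subsystem reads $\dot x_1 = -x_1^3 - 2x_2 + (u + \Delta_1)$, $\dot x_2 = x_1 + 0.5\cos(x_1^2)\sin(x_2^3) - (u + \Delta_1)$; the initial state, step size, and zero input are hypothetical and are not the settings used for the reported figures.

```python
import math


def subsystem_26(x, u, delta):
    """Right-hand side of the assumed reading of subsystem (26)."""
    x1, x2 = x
    v = u + delta                      # control plus matched disturbance
    dx1 = -x1 ** 3 - 2.0 * x2 + v
    dx2 = x1 + 0.5 * math.cos(x1 ** 2) * math.sin(x2 ** 3) - v
    return [dx1, dx2]


def euler_step(x, u, delta, dt):
    """One forward-Euler step of length dt."""
    dx = subsystem_26(x, u, delta)
    return [xi + dt * di for xi, di in zip(x, dx)]


x0 = [1.0, 0.0]                         # hypothetical initial state
rate0 = subsystem_26(x0, 0.0, 0.0)
x1_state = euler_step(x0, 0.0, 0.0, 0.01)
print(rate0, x1_state)
```

A full simulation would call `euler_step` in a loop, select the active subsystem from the switching signal at each step, and replace the zero input with the ADP controller.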




















Figure 1. The response of x2 (plotted against time)
Figure 2. The response of x2 (plotted against time)


5. CONCLUSION 

This paper has investigated the ADP problem for switched nonlinear systems under external disturbances. The nominal system, obtained by eliminating the disturbance, is considered first, and classical nonlinear control techniques are then applied. Neural networks have been designed to approximate the actor and the critic, and the learning algorithm tunes them simultaneously. Finally, the UUB property of the closed-loop system is guaranteed.


ACKNOWLEDGEMENT 
This research was supported by research foundation funded by Thai Nguyen University of Technology. 


REFERENCES 

[1] Vu, Tran Anh and Nam, Dao Phuong and Huong, Pham Thi Viet, “Analysis and control design of transformerless high 
gain, high efficient buck-boost DC-DC converters,” in 2016 IEEE International Conference on Sustainable Energy 
Technologies (ICSET), Hanoi, 2016, pp. 72-77, doi: 10.1109/ICSET.2016.7811759. 

[2] Nam, Dao Phuong and Thang, Bui Minh and Thanh, Nguyen Truong, “Adaptive Tracking Control for a Boost DC-DC Converter: A Switched Systems Approach,” in 2018 4th International Conference on Green Technology and Sustainable Development (GTSD), Ho Chi Minh City, 2018, pp. 702-705, doi: 10.1109/GTSD.2018.8595580.


[3] Thanh, Nguyen Truong and Sam, Pham Ngoc and Nam, Dao Phuong, “An Adaptive Backstepping Control for Switched Systems in presence of Control Input Constraint,” in 2019 International Conference on System Science and Engineering (ICSSE), Dong Hoi, Vietnam, 2019, pp. 196-200, doi: 10.1109/ICSSE.2019.8823125.

[4] Panigrahi, Swetapadma and Thakur, Amarnath, “Modeling and simulation of three phases cascaded H-bridge grid-tied PV inverter,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 8, no. 1, pp. 1-9, 2019, doi: 10.11591/eei.v8i1.1225.

[5] Devarajan, N and Reena, A, “Reduction of switches and DC sources in Cascaded Multilevel Inverter,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 4, no. 3, pp. 186-195, 2015, doi: 10.11591/eei.v4i3.320.

[6] Venkatesan, M and Rajeshwari, R and Deverajan, N and Kaliyamoorthy, M, “Comparative study of three phase grid connected photovoltaic inverter using pi and fuzzy logic controller with switching losses calculation,” International Journal of Power Electronics and Drive Systems (IJPEDS), vol. 7, no. 2, pp. 543-550, 2016.

[7] Zhang, Lixian and Xiang, Weiming, “Mode-identifying time estimation and switching-delay tolerant control for switched systems: An elementary time unit approach,” Automatica, vol. 64, pp. 174-181, 2016, doi: 10.1016/j.automatica.2015.11.010.

[8] Yuan, Shuai and Zhang, Lixian and De Schutter, Bart and Baldi, Simone, “A novel Lyapunov function for a non-weighted L2 gain of asynchronously switched linear systems,” Automatica, vol. 87, pp. 310-317, 2018, doi: 10.1016/j.automatica.2017.10.018.

[9] Xiang, Weiming and Lam, James and Li, Panshuo, “On stability and H∞ control of switched systems with random switching signals,” Automatica, vol. 95, pp. 419-425, 2018, doi: 10.1016/j.automatica.2018.06.001.

[10] Lin, Jinxing and Zhao, Xudong and Xiao, Min and Shen, Jingjin, “Stabilization of discrete-time switched singular systems with state, output and switching delays,” Journal of the Franklin Institute, vol. 356, pp. 2060-2089, 2019, doi: 10.1016/j.jfranklin.2018.11.034.

[11] Briat, Corentin, “Convex conditions for robust stabilization of uncertain switched systems with guaranteed minimum and mode-dependent dwell-time,” Systems & Control Letters, vol. 78, pp. 63-72, 2015, doi: 10.1016/j.sysconle.2015.01.012.

[12] Lian, Jie and Li, Can, “Event-triggered control for a class of switched uncertain nonlinear systems,” Systems & Control Letters, vol. 135, pp. 1-5, 2020, doi: 10.1016/j.sysconle.2019.104592.

[13] Anyaka, Boniface O and Manirakiza, J Felix and Chike, Kenneth C and Okoro, Prince A, “Optimal unit commitment of a power plant using particle swarm optimization approach,” International Journal of Electrical and Computer Engineering (IJECE), vol. 10, no. 2, pp. 1135-1141, 2020, doi: 10.11591/ijece.v10i2.pp1135-1141.

[14] Devi, Palakaluri Srividya and Santhi, R Vijaya, “Introducing LQR-fuzzy for a dynamic multi area LFC-DR model,” International Journal of Electrical and Computer Engineering (IJECE), vol. 9, no. 2, pp. 861-874, 2019, doi: 10.11591/ijece.v9i2.pp861-874.

[15] Omar, Othman AM and Badra, Niveen M and Attia, Mahmoud A, “Enhancement of on-grid pv system under irradiance and temperature variations using new optimized adaptive controller,” International Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 5, pp. 2650-2660, 2018, doi: 10.11591/ijece.v8i5.2650-2660.

[16] Sharma, Purva and Saini, Deepak and Saxena, Akash, “Fault detection and classification in transmission line using wavelet transform and ANN,” Bulletin of Electrical Engineering and Informatics (BEEI), vol. 5, no. 3, pp. 284-295, 2016.

[17] Ilamathi, P and Selladurai, V and Balamurugan, K, “Predictive modelling and optimization of nitrogen oxides emission in coal power plant using Artificial Neural Network and Simulated Annealing,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 1, no. 1, pp. 11-18, 2012.

[18] Vamvoudakis, Kyriakos G and Vrabie, Draguna and Lewis, Frank L, “Online adaptive algorithm for optimal control with integral reinforcement learning,” International Journal of Robust and Nonlinear Control, vol. 24, no. 17, pp. 2686-2710, 2013, doi: 10.1002/rnc.3018.

[19] Bai, Weiwei and Zhou, Qi and Li, Tieshan and Li, Hongyi, “Adaptive reinforcement learning neural network control for uncertain nonlinear system with input saturation,” IEEE Transactions on Cybernetics, vol. 50, no. 8, pp. 3433-3443, Aug. 2020, doi: 10.1109/TCYB.2019.2921057.

[20] Chen, Ci and Modares, Hamidreza and Xie, Kan and Lewis, Frank L and Wan, Yan and Xie, Shengli, “Reinforcement learning-based adaptive optimal exponential tracking control of linear systems with unknown dynamics,” IEEE Transactions on Automatic Control, vol. 64, no. 11, pp. 4423-4438, Nov. 2019, doi: 10.1109/TAC.2019.2905215.

[21] Vamvoudakis, Kyriakos G and Ferraz, Henrique, “Model-free event-triggered control algorithm for continuous-time linear systems with optimal performance,” Automatica, vol. 87, pp. 412-420, 2018, doi: 10.1016/j.automatica.2017.03.013.

[22] Gao, Weinan and Jiang, Yu and Jiang, Zhong-Ping and Chai, Tianyou, “Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming,” Automatica, vol. 72, pp. 37-45, 2016, doi: 10.1016/j.automatica.2016.05.008.

[23] Zhang, Tianping and Xu, Haoxiang, “Adaptive optimal dynamic surface control of strict-feedback nonlinear systems with output constraints,” International Journal of Robust and Nonlinear Control, vol. 30, no. 5, pp. 2059-2078, 2020, doi: 10.1002/rnc.4864.

[24] Wang, Ding and Mu, Chaoxu, “Adaptive-critic-based robust trajectory tracking of uncertain dynamics and its application to a spring-mass-damper system,” IEEE Transactions on Industrial Electronics, vol. 65, no. 1, pp. 654-663, Jan. 2018, doi: 10.1109/TIE.2017.2722424.

[25] Wen, Guoxing and Ge, Shuzhi Sam and Chen, CL Philip and Tu, Fangwen and Wang, Shengnan, “Adaptive tracking control of surface vessel using optimized backstepping technique,” IEEE Transactions on Cybernetics, vol. 49, no. 9, pp. 3420-3431, Sept. 2019, doi: 10.1109/TCYB.2018.2844177.




